Abstract

Background: Analyzing the functions and interactions of proteins and domains is important for understanding 
cellular systems and biological networks. Many methods for predicting protein-protein interactions have been 
developed. It is known that mutual information between residues at interacting sites can be higher than that at 
non-interacting sites. This is based on the premise that amino acid residues at interacting sites have coevolved with 
the corresponding residues in the partner proteins. Several studies have shown that such mutual 
information is useful for identifying contact residues in interacting proteins. 

Results: We propose novel methods using conditional random fields for predicting protein-protein interactions. 
We focus on the mutual information between residues, and combine it with conditional random fields. In the 
methods, protein-protein interactions are modeled using domain-domain interactions. We perform computational 
experiments using protein-protein interaction datasets for several organisms, and calculate the AUC (Area Under ROC 
Curve) score. The results suggest that our proposed methods with and without mutual information outperform the EM 
(Expectation Maximization) method proposed by Deng et al., which is one of the best predictors based on 
domain-domain interactions. 

Conclusions: We propose novel methods using conditional random fields with and without mutual information 
between domains. Our methods based on domain-domain interactions are useful for predicting protein-protein 
interactions. 




Background 

Understanding protein functions and protein-protein 
interactions is one of the important topics in the fields of 
molecular biology and bioinformatics. Recently, many 
researchers have focused on the investigation of amino 
acid residues of proteins to reveal interactions and con- 
tacts between residues [1-4]. If residues at important sites 
for interactions between proteins are substituted in one 
protein, the corresponding residues in interacting partner 
proteins are expected to be also substituted by selection 
pressure. Otherwise, such mutated proteins may lose the 
interactions. Fraser et al. confirmed that interacting proteins 
evolve at similar evolutionary rates by comparing 
putatively orthologous protein sequences between S. cere- 
visiae and C. elegans[5]. It means that substitutions for 
contact residues occur in both interacting proteins as 
long as the proteins keep interacting with each other. 
Therefore, mutual information (MI) between residues is 
useful for predicting protein-protein interactions for pro- 
teins of unknown function. MI is calculated from multi- 
ple sequence alignments for homologous protein 
sequences. Weigt et al. identified direct residue contacts 
between sensor kinase and response regulator proteins 
by message passing, which is an improvement of MI [4]. 
Burger and van Nimwegen used a dependence tree where 
a node corresponds to a position of amino acid 
sequences, and predicted interactions using a Bayesian 
network method [2]. Markov random field and conditional random field 
models, on the other hand, have been well studied in the field of 
natural language processing [6,7]. 
Also in bioinformatics, protein function prediction meth- 
ods from protein-protein interaction network and other 
biological networks were developed using Markov ran- 
dom fields [8,9]. On the other hand, several prediction 
methods have been developed based on domain-domain 
interactions. Deng et al. proposed a domain-based prob- 
abilistic model of protein-protein interactions, and devel- 
oped EM (Expectation Maximization) method [10]. 
Based on this probabilistic model, LP (Linear Program- 
ming)-based methods were developed [11], and Chen et 
al. improved the accuracy of interaction strength predic- 
tion by APM (Association Probabilistic Method) [12]. In 
this paper, we propose prediction methods based on 
domain-domain interactions using conditional random 
fields with and without mutual information. Further- 
more, we perform computational experiments for several 
protein-protein interaction datasets, compare the meth- 
ods with the EM method proposed by Deng et al. [10], 
which is one of the best predictors based on domain- 
domain interactions, and the association method pro- 
posed by Sprinzak and Margalit [13] (the APM method 
for binary interaction data is equivalent to the association 
method), and show that our methods outperform the EM 
method and the association method. 

Mutual information between domains 

In order to investigate the relationship between two 
positions of proteins, MI for distributions of amino 
acids at the positions is used. Such distributions can be 
obtained from multiple alignments of protein sequences 
and domain sequences. In this section, we briefly review 
MI for distributions of amino acids, and explain MI 
between domains. 



We assume that multiple sequence alignments for domains $D_m$ and $D_n$ are obtained (see Figure 1). In order to calculate MI, we need joint appearance frequencies. However, we cannot tell which sequence in the multiple alignment of domain $D_m$ corresponds to a specified sequence in that of $D_n$. Therefore, we assume that sequences contained in the same organism can be paired. In the example of Figure 1, the second sequence of $D_m$ is paired with the first one of $D_n$, the third one of $D_m$ is paired with the second one of $D_n$, and so on. The first sequence of $D_m$ is not counted in the appearance frequencies because it is not paired with any sequence of $D_n$, although it may be paired with sequences of domains other than $D_n$.

Let $\mathcal{A}$ be a set of amino acids, $f_i(A)$ be the appearance frequency of amino acid $A$ at position $i$ in domains $D_m$ and $D_n$, and $f_{ij}(A, B)$ be the joint appearance frequency of a pair of amino acids, $A$ at position $i$ in $D_m$ and $B$ at position $j$ in $D_n$, where each frequency is divided by the number of paired sequences $M$ in the multiple alignments such that $\sum_{A \in \mathcal{A}} f_i(A) = \sum_{A,B \in \mathcal{A}} f_{ij}(A, B) = 1$.

Multiple alignments often include some gaps. Weigt et al. counted the frequencies of gaps as well as amino acids [4]. Therefore, we also consider gaps to be a kind of amino acid, that is, the number of distinct amino acids is $|\mathcal{A}| = 21$. Then, mutual information for positions $i$ in $D_m$ and $j$ in $D_n$ is defined as the Kullback-Leibler divergence between the product of appearance frequencies, $f_i(A) f_j(B)$, and the joint appearance frequencies, $f_{ij}(A, B)$, as follows.

$$MI_{ij} = \sum_{A,B \in \mathcal{A}} f_{ij}(A, B) \log \frac{f_{ij}(A, B)}{f_i(A) f_j(B)} \qquad (1)$$



Figure 1 Illustration of the calculation of mutual information from multiple alignments of domains. Domains $D_m$ and $D_n$ each have multiple alignments of sequences from several organisms. Mutual information is calculated for each pair of positions $i$ and $j$.

If the frequency distributions of amino acids at positions $i$ and $j$ are independent of each other, $f_{ij}(A, B) \approx f_i(A) f_j(B)$, and $MI_{ij}$ approaches zero. This means that the two positions are not related to each other in the evolutionary process. If domains $D_m$ and $D_n$ interact at the positions, $MI_{ij}$ is expected to be high because the positions have coevolved through the evolutionary process in order to keep the interaction. It should be noted that two positions $i$ and $j$ do not always directly interact even if $MI_{ij}$ is high [4]. However, proteins with such high values of MI may directly interact with each other at other positions in the proteins.
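As a minimal sketch of equation (1), the following Python function computes the mutual information between two alignment columns whose entries are already paired by organism. The function name and the use of plain strings for columns are illustrative assumptions; no pseudo-count or background correction is applied here, and terms with zero joint frequency are skipped because they contribute nothing to the sum.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """MI (natural log) between two alignment columns of paired sequences.

    col_i[k] and col_j[k] are the residues (or '-' for a gap) of the k-th
    paired sequence at position i of one domain and position j of the other.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    m = len(col_i)                      # number of paired sequences M
    n_i = Counter(col_i)                # counts of A at position i
    n_j = Counter(col_j)                # counts of B at position j
    n_ij = Counter(zip(col_i, col_j))   # joint counts of (A, B)
    mi = 0.0
    for (a, b), count in n_ij.items():
        f_ab = count / m
        mi += f_ab * math.log(f_ab / ((n_i[a] / m) * (n_j[b] / m)))
    return mi
```

Two perfectly co-varying columns with two equally frequent residue pairs give $\log 2$, whereas columns whose joint frequencies factorize give zero.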

However, we need to reduce $MI_{ij}$ because it can be unnecessarily high depending on the distributions $f_i(A)$ and $f_j(B)$. For that purpose, we make use of $MI_{ij}^{(random)}$, which is the mutual information $MI_{ij}$ calculated from the joint frequency $f_{ij}(A, B)$ obtained by shuffling at random the combinations of sequences in the multiple alignments. In this paper, we repeat the procedure 400 times according to [4], and take the average. For practical uses of MI, $f_i(A)$, $f_j(B)$, and $f_{ij}(A, B)$ should be positive values. Otherwise, $MI_{ij}$ cannot be computed numerically. Therefore, we use the following pseudo-count as in [4],



$$f_i^{(pseudo)}(A) = \frac{\eta + f_i(A) M}{|\mathcal{A}| \eta + M}, \qquad (2)$$

$$f_{ij}^{(pseudo)}(A, B) = \frac{\eta / |\mathcal{A}| + f_{ij}(A, B) M}{|\mathcal{A}| \eta + M}, \qquad (3)$$

where $\eta$ is a constant value; in this paper we use $\eta = 1$. It should be noted that the sums over all amino acids satisfy $\sum_{A \in \mathcal{A}} f_i^{(pseudo)}(A) = 1$ and $\sum_{A,B \in \mathcal{A}} f_{ij}^{(pseudo)}(A, B) = 1$ because $\sum_{A \in \mathcal{A}} f_i(A) = \sum_{A,B \in \mathcal{A}} f_{ij}(A, B) = 1$.

In order to investigate interactions between proteins, we need MI between domains included in the proteins. Thus, we define MI between domains $D_m$ and $D_n$, $M_{mn}$, to be the maximum of MI over all positions $i$ and $j$ as follows.

$$M_{mn} = \max_{i,j} \left( MI_{ij} - \langle MI_{ij}^{(random)} \rangle \right), \qquad (4)$$

where $\langle v \rangle$ means the average of $v$, and $i$ and $j$ are positions of $D_m$ and $D_n$, respectively. Since $MI_{ij}$ is calculated to be high for positions $i$ and $j$ that include many gaps, we exclude positions that include more than 20% gaps as in [14].

Conditional random field model for PPI

In this section, we propose a probabilistic model for protein-protein and domain-domain interactions using conditional random fields [6,7], because two domains $D_m$ and $D_n$ do not always interact even if the mutual information $M_{mn}$ is large. For example, Weigt et al. improved MI and proposed direct information (DI) because residues do not always contact each other even if the MI is large [4]. Most proteins contain domains, as is well known. If two proteins do not interact with each other, no two domains contained in the proteins may interact with each other. In the left example of Figure 2, protein $P_i$ consists of domains $D_1$ and $D_2$, and protein $P_j$ consists of domain $D_3$. If $P_i$ and $P_j$ do not interact, neither of the pairs $(D_1, D_3)$ and $(D_2, D_3)$ interacts. Deng et al. proposed a probabilistic model for a pair of proteins as follows [10]. By assuming that proteins $P_i$ and $P_j$ interact if and only if at least one pair of domains included in the proteins interacts, and that the events that domains interact are independent of each other, they defined

$$\Pr(P_{ij} = 1) = 1 - \prod_{D_{mn} \in P_{ij}} \left( 1 - \Pr(D_{mn} = 1) \right), \qquad (5)$$

where $P_{ij} = 1$ means that proteins $P_i$ and $P_j$ interact, $D_{mn} = 1$ means that domains $D_m$ and $D_n$ interact, $D_{mn} \in P_{ij}$ means that domain $D_m$ is included in protein $P_i$ and $D_n$ is included in $P_j$, and the product on the right-hand side is taken over all domain pairs $(D_m, D_n)$ included in the protein pair $(P_i, P_j)$.

Figure 2 Markov random field model for protein-protein interactions. Left: Example of proteins $P_i$ and $P_j$. $P_i$ consists of domains $D_1$ and $D_2$, and $P_j$ consists of domain $D_3$. Right: Factor graph $G(U, V, E)$. There exists an edge between $P_{ij} \in U$ and $D_{mn} \in V$ if and only if $D_{mn} \in P_{ij}$.

By transforming equation (5), we have

$$1 - \Pr(P_{ij} = 1) = \prod_{D_{mn} \in P_{ij}} \left( 1 - \Pr(D_{mn} = 1) \right) \qquad (6)$$

$$= \exp \left( \sum_{D_{mn} \in P_{ij}} \lambda^{(mn)} \right), \qquad (7)$$

where $\lambda^{(mn)} = \log(1 - \Pr(D_{mn} = 1))$.
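The domain-based model of Deng et al. in equations (5) to (7) can be illustrated with a short Python sketch. The function names are hypothetical, and the input is assumed to be the list of probabilities $\Pr(D_{mn} = 1)$ for the domain pairs contained in one protein pair; the second function requires every probability to be strictly less than 1 so that the logarithm is defined.

```python
import math


def protein_interaction_prob(domain_pair_probs):
    """Equation (5): Pr(P_ij = 1) = 1 - prod over D_mn of (1 - Pr(D_mn = 1))."""
    non_interaction = 1.0
    for p in domain_pair_probs:
        non_interaction *= 1.0 - p
    return 1.0 - non_interaction


def protein_interaction_prob_log(domain_pair_probs):
    """Equations (6)-(7): 1 - Pr(P_ij = 1) = exp(sum of lambda^(mn))."""
    # lambda^(mn) = log(1 - Pr(D_mn = 1)); assumes every probability is < 1
    lambdas = [math.log(1.0 - p) for p in domain_pair_probs]
    return 1.0 - math.exp(sum(lambdas))
```

For two domain pairs with interaction probabilities 0.2 and 0.5, both forms give $1 - 0.8 \times 0.5 = 0.6$, which shows that the exponential form is only a rewriting of the product form.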

From this equation, we can consider the following Markov random field model for protein pair $(P_i, P_j)$ (see Figure 2).

$$\Pr(P_{ij} = p_{ij}, d) = \frac{1}{Z_{ij}} \exp \left( \sum_{D_{mn} \in P_{ij}} \sum_{s,t \in \{0,1\}} \lambda_{s,t}^{(ij,mn)} f_{s,t}^{(ij,mn)}(p_{ij}, d_{mn}) \right), \qquad (8)$$

where $p_{ij} \in \{0, 1\}$, $d$ means a set of events on domain-domain interactions, $D_{mn} = d_{mn}$ ($d_{mn} \in \{0, 1\}$), $f_{s,t}^{(ij,mn)}(p_{ij}, d_{mn})$ denotes a local feature, $\lambda_{s,t}^{(ij,mn)}$ is the corresponding weight parameter and is related to the joint probability $\Pr(P_{ij} = s, D_{mn} = t)$, and $Z_{ij}$ denotes the normalization constant. For instance, equation (8) for $p_{ij} = 0$ is equivalent to equation (7) in the case that $\lambda_{s,t}^{(ij,mn)} = \lambda^{(mn)}$ for all protein pairs $(P_i, P_j)$ and $f_{s,t}^{(ij,mn)}(p_{ij}, d_{mn}) = 1$ if $s = t = 0$, otherwise 0.

In Markov random fields, random variables have Markov properties represented as an undirected graph [15]. The factor graph for our model is a bipartite graph $G(U, V, E)$ with a set of vertices $U$ corresponding to protein-protein interactions $P_{ij}$, a set of vertices $V$ corresponding to domain-domain interactions $D_{mn}$, and a set of edges $E$ between $U$ and $V$, as shown on the right of Figure 2. There exists an edge between $P_{ij} \in U$ and $D_{mn} \in V$ if and only if $D_{mn} \in P_{ij}$. For the left example of Figure 2, protein pair $(P_i, P_j)$ includes domain pairs $(D_1, D_3)$ and $(D_2, D_3)$. Then, in the factor graph, the vertex of $P_{ij}$ is connected with the vertices of $D_{13}$ and $D_{23}$. Although the vertex of $P_{ij}$ has no adjacent vertices other than those of $D_{13}$ and $D_{23}$, the vertices of $D_{13}$ and $D_{23}$ can be connected with vertices other than that of $P_{ij}$.

Since $\Pr(P_{ij} = 0 \mid D_{mn} = t) = 1 - \Pr(P_{ij} = 1 \mid D_{mn} = t)$, it is redundant to consider both $s = 0, 1$, and it is sufficient to consider only $s = 1$. Therefore, in order to simplify the model, we substitute $\lambda_{1,t}^{(ij,mn)} = \lambda_t^{(mn)}$, $f_{1,t}^{(ij,mn)} = f_t^{(mn)}$, and $f_{0,t}^{(ij,mn)} = 0$ for all protein pairs $(P_i, P_j)$. Then, we have the following joint probability,

$$\Pr(p, d) = \frac{1}{Z} \exp \left( \sum_{P_{ij}} \sum_{D_{mn} \in P_{ij}} \sum_{t \in \{0,1\}} \lambda_t^{(mn)} f_t^{(mn)}(p_{ij}, d_{mn}) \right), \qquad (9)$$

where $p$ means a set of events on protein-protein interactions, $P_{ij} = p_{ij}$.

We here introduce mutual information between domains $M = \{M_{mn}\}$ as given conditional data in order to combine it with the probabilistic model. Then, equation (9) can be written as

$$\Pr(p_{ij} \mid M) = \frac{1}{Z_{ij}(M)} \exp \left( \sum_{D_{mn} \in P_{ij}} \sum_{t \in \{0,1\}} \lambda_t^{(mn)} f_t^{(mn)}(p_{ij}, M_{mn}) \right), \qquad (10)$$

where

$$Z_{ij}(M) = \sum_{p_{ij} \in \{0,1\}} \exp \left( \sum_{D_{mn} \in P_{ij}} \sum_{t \in \{0,1\}} \lambda_t^{(mn)} f_t^{(mn)}(p_{ij}, M_{mn}) \right), \qquad (11)$$

$$f_t^{(mn)}(p_{ij}, M_{mn}) = \begin{cases} \sigma(M_{mn} - c) & (\text{if } p_{ij} = 1 \text{ and } t = 1) \\ \sigma(c - M_{mn}) & (\text{if } p_{ij} = 0 \text{ and } t = 0) \\ 0 & (\text{if } p_{ij} = 1 \text{ and } t = 0) \\ -1 & (\text{if } p_{ij} = 0 \text{ and } t = 1) \end{cases}, \qquad (12)$$

$\sigma(x) = 1/(1 + e^{-x})$ is an increasing function, and $c$ is a positive constant. It should be noted that a negative value, -1, is given to $f_t^{(mn)}$ because it is undesirable that a pair of domains interacts although the proteins having the pair do not interact. In this way, the local feature $f_t^{(mn)}$ correlates protein-protein interactions $P_{ij}$ with domain-domain interactions $D_{mn}$ (see Figure 2).
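To make equations (10) to (12) concrete, the sketch below evaluates $\Pr(P_{ij} = 1 \mid M)$ for a single protein pair. The data layout is an assumption made for illustration: `weights` maps a (domain pair, t) key to $\lambda_t^{(mn)}$, `mi` maps a domain pair to $M_{mn}$, and the default $c = 0.8$ is the value chosen later in the experiments. The normalization over $p_{ij} \in \{0, 1\}$ is done after subtracting the larger score to avoid overflow.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def feature(t, p_ij, m_mn, c=0.8):
    """Local feature f_t^(mn)(p_ij, M_mn) of equation (12)."""
    if p_ij == 1 and t == 1:
        return sigmoid(m_mn - c)
    if p_ij == 0 and t == 0:
        return sigmoid(c - m_mn)
    if p_ij == 1 and t == 0:
        return 0.0
    return -1.0  # p_ij == 0 and t == 1


def score(p_ij, domain_pairs, weights, mi, c=0.8):
    """Exponent of equation (10) for a given value of p_ij."""
    return sum(weights[(dp, t)] * feature(t, p_ij, mi[dp], c)
               for dp in domain_pairs for t in (0, 1))


def interaction_prob(domain_pairs, weights, mi, c=0.8):
    """Pr(P_ij = 1 | M): equation (10) normalized by Z_ij(M) of equation (11)."""
    s0 = score(0, domain_pairs, weights, mi, c)
    s1 = score(1, domain_pairs, weights, mi, c)
    top = max(s0, s1)  # subtract the maximum for numerical stability
    e0, e1 = math.exp(s0 - top), math.exp(s1 - top)
    return e1 / (e0 + e1)
```

With all weights equal to zero the two outcomes are equally likely, so the function returns 0.5; a positive $\lambda_1^{(mn)}$ raises the probability of interaction, more strongly when $M_{mn}$ exceeds $c$.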

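For a conditional random field model without MI, we use the following local feature instead of $f_t^{(mn)}(p_{ij}, M_{mn})$.

$$f_t^{(mn)}(p_{ij}, d_{mn}) = \begin{cases} 1 & (\text{if } p_{ij} = t) \\ 0 & (\text{if } p_{ij} = 1 \text{ and } t = 0) \\ -1 & (\text{if } p_{ij} = 0 \text{ and } t = 1) \end{cases}. \qquad (13)$$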



Parameter estimation 

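In this section, we discuss how to estimate the parameters $\Lambda = \{\lambda_t^{(mn)}\}$. We assume that protein-protein interaction data $p = \{p_{ij}\}$ are given. Then, the likelihood function is represented by

$$P(p \mid M) = \prod_{P_{ij} \in p} \Pr(p_{ij} \mid M) = \frac{1}{Z(M)} \exp \left( \sum_{P_{ij} \in p} \sum_{D_{mn} \in P_{ij}} \sum_{t \in \{0,1\}} \lambda_t^{(mn)} f_t^{(mn)}(p_{ij}, M_{mn}) \right), \qquad (14)$$

where $Z(M) = \prod_{P_{ij} \in p} Z_{ij}(M)$. By taking the logarithm, we have

$$l(\Lambda) = \log P(p \mid M) = \sum_{P_{ij} \in p} \left( \sum_{D_{mn} \in P_{ij}} \sum_{t \in \{0,1\}} \lambda_t^{(mn)} f_t^{(mn)}(p_{ij}, M_{mn}) - \log Z_{ij}(M) \right). \qquad (15)$$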
We estimate the parameters by maximizing the log-likelihood function $l(\Lambda)$. Since $\log(e^x + e^y)$ is a convex function of the variables $x$ and $y$, $\log Z_{ij}(M)$ is convex in $\Lambda$, that is, $l(\Lambda)$ is a concave function, and we are able to obtain a global maximum. For maximizing such functions, various methods such as the steepest descent method, Newton's method, and the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method [16] have been developed. Newton's method calculates the inverse of the Hessian matrix of the objective function, but its computational cost is high. Quasi-Newton methods therefore approximate the matrix efficiently using only the first derivatives, that is, the gradient. In this paper, we use the BFGS method, which is one of the quasi-Newton methods. By differentiating equation (15) partially with respect to each parameter $\lambda_t^{(mn)}$, we have

$$\frac{\partial l(\Lambda)}{\partial \lambda_t^{(mn)}} = \sum_{P_{ij} \in p \,:\, D_{mn} \in P_{ij}} \left( f_t^{(mn)}(p_{ij}, M_{mn}) - \sum_{p'_{ij} \in \{0,1\}} \Pr(p'_{ij} \mid M) f_t^{(mn)}(p'_{ij}, M_{mn}) \right). \qquad (16)$$

In the BFGS method, this gradient is evaluated repeatedly to update the solution.
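The log-likelihood of equation (15) and its gradient, equation (16), can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the implementation used in the experiments: the data layout (a list of `(p_ij, domain_pairs)` tuples), the function names, and in particular the plain gradient-ascent loop, which stands in for BFGS only to keep the example free of external dependencies, are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def feature(t, p_ij, m_mn, c=0.8):
    """Local feature of equation (12)."""
    if p_ij == 1 and t == 1:
        return sigmoid(m_mn - c)
    if p_ij == 0 and t == 0:
        return sigmoid(c - m_mn)
    return 0.0 if p_ij == 1 else -1.0


def log_likelihood_and_gradient(data, weights, mi, c=0.8):
    """data: list of (observed p_ij, list of domain-pair keys).

    Returns l(Lambda) of equation (15) and a dict holding the partial
    derivatives of equation (16), keyed like `weights` by (domain pair, t).
    """
    ll = 0.0
    grad = {key: 0.0 for key in weights}
    for p_obs, pairs in data:
        scores = {p: sum(weights[(dp, t)] * feature(t, p, mi[dp], c)
                         for dp in pairs for t in (0, 1)) for p in (0, 1)}
        top = max(scores.values())
        log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
        ll += scores[p_obs] - log_z
        probs = {p: math.exp(scores[p] - log_z) for p in (0, 1)}
        for dp in pairs:
            for t in (0, 1):
                expected = sum(probs[p] * feature(t, p, mi[dp], c) for p in (0, 1))
                grad[(dp, t)] += feature(t, p_obs, mi[dp], c) - expected
    return ll, grad


def gradient_ascent(data, mi, keys, steps=200, rate=0.1):
    """Simple stand-in optimizer (not BFGS): fixed-step gradient ascent."""
    weights = {key: 0.0 for key in keys}
    for _ in range(steps):
        _, grad = log_likelihood_and_gradient(data, weights, mi)
        for key in weights:
            weights[key] += rate * grad[key]
    return weights
```

Because $l(\Lambda)$ is concave, any ascent method with a sufficiently small step moves toward the global maximum; a quasi-Newton method such as BFGS would typically need far fewer gradient evaluations than this loop.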

Computational experiments 

Data and implementation 

We used protein-protein interaction data of H. sapiens, D. melanogaster, and C. elegans from the DIP database [17] (file name 'dip20091230.txt'). We used the UniProt Knowledgebase database (version 15.4) [18] as protein domain inclusion data. We deleted proteins that did not have any domain, and obtained as positive data 294 interacting protein pairs that included 300 distinct proteins and 320 domains for H. sapiens, 449 interacting pairs that included 562 proteins and 449 domains for D. melanogaster, and 250 interacting pairs that included 602 proteins and 476 domains for C. elegans.

We used the Pfam database (version 24.0) [19] to obtain multiple sequence alignments for domains, and calculated MI, $M_{mn}$, for each pair of domains. Figure 3 shows the distributions of domain MI $M_{mn}$ for H. sapiens, D. melanogaster, and C. elegans. We can see from the figure that most domain MIs are less than about 0.8 for all organisms. We therefore consider that domains $D_m$ and $D_n$ with $M_{mn}$ less than 0.8 may not interact, and that domains with $M_{mn}$ more than 0.8 are more likely to interact with each other. Accordingly, we set the constant $c$ in equation (12) to 0.8. We also tried several values from 0.6 to 1.0 for $c$, and the results were similar to the case of $c = 0.8$.
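The domain-level MI $M_{mn}$ computed in this step (equation (4), with the pseudo-counts of equations (2) and (3), the shuffled baseline, and the 20% gap filter) can be sketched as below. Several details are assumptions made for illustration: the function names, the representation of each alignment as a list of equal-length strings already paired by organism, the restriction of characters to the 21-letter alphabet, and the choice to shuffle one column independently for each position pair rather than re-pairing whole sequences once per repetition.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus the gap symbol


def pseudo_mi(col_i, col_j, eta=1.0):
    """MI of equation (1) using the pseudo-counts of equations (2) and (3)."""
    m, q = len(col_i), len(ALPHABET)
    denom = q * eta + m
    n_i = dict.fromkeys(ALPHABET, 0)
    n_j = dict.fromkeys(ALPHABET, 0)
    n_ij = {}
    for a, b in zip(col_i, col_j):
        n_i[a] += 1
        n_j[b] += 1
        n_ij[(a, b)] = n_ij.get((a, b), 0) + 1
    mi = 0.0
    for a in ALPHABET:
        f_a = (eta + n_i[a]) / denom
        for b in ALPHABET:
            f_b = (eta + n_j[b]) / denom
            f_ab = (eta / q + n_ij.get((a, b), 0)) / denom
            mi += f_ab * math.log(f_ab / (f_a * f_b))
    return mi


def domain_mi(aln_m, aln_n, shuffles=400, max_gap=0.2, seed=0):
    """M_mn of equation (4); aln_m[k] and aln_n[k] belong to the same organism."""
    rng = random.Random(seed)

    def usable_columns(aln):
        # drop positions whose gap fraction exceeds max_gap
        return [col for col in zip(*aln) if col.count("-") / len(col) <= max_gap]

    best = float("-inf")  # returned unchanged if no position pair is usable
    for col_i in usable_columns(aln_m):
        for col_j in usable_columns(aln_n):
            total = 0.0
            for _ in range(shuffles):
                shuffled = list(col_j)
                rng.shuffle(shuffled)  # break the pairing between sequences
                total += pseudo_mi(col_i, shuffled)
            best = max(best, pseudo_mi(col_i, col_j) - total / shuffles)
    return best
```

A pair of perfectly co-varying columns yields a positive value, while a column that is constant across organisms yields zero because shuffling leaves its joint frequencies unchanged.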

We selected non-interacting protein pairs as negative 
data uniformly at random such that negative data did 
not overlap with the positive data. The number of nega- 
tive data was the same as that of positive data for each 
organism. 
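The negative-data selection described above can be sketched as follows. The function name is hypothetical, and two details are assumptions not stated in the text: protein pairs are treated as unordered, and self-pairs are excluded from the candidates.

```python
import random


def sample_negative_pairs(proteins, positive_pairs, seed=0):
    """Draw, uniformly at random, as many non-positive pairs as positives."""
    rng = random.Random(seed)
    positives = {frozenset(pair) for pair in positive_pairs}
    candidates = [(a, b)
                  for idx, a in enumerate(proteins)
                  for b in proteins[idx + 1:]
                  if frozenset((a, b)) not in positives]
    # random.sample raises ValueError if there are too few candidates
    return rng.sample(candidates, len(positives))
```

Sampling without replacement from the explicit candidate list guarantees that the negative pairs are distinct and never overlap with the positive data, at the cost of enumerating all protein pairs, which is feasible for datasets of a few hundred proteins.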

We used libLBFGS (version 1.9), a C implementation of the limited-memory BFGS method [20], with default parameters to estimate the parameters $\lambda_t^{(mn)}$. It is available on the web page http://www.chokkan.org/software/liblbfgs/.

Table 1 The AUC results for training and test datasets of H. sapiens by the CRF method with MI, that without MI, the EM method, and the association method

iteration   CRF with MI           CRF without MI        EM                    Assoc
            training  test        training  test        training  test        training  test
1st         0.999366  0.989247    0.999366  0.881720    0.999819  0.709677    0.999602  0.709677
2nd         0.998787  0.919355    0.999312  0.923387    0.999909  0.875000    0.999330  0.854839
3rd         1.000000  0.847222    1.000000  0.833333    1.000000  0.861111    1.000000  0.861111
4th         0.999351  0.989583    0.999369  1.000000    0.999856  0.989583    0.999351  0.989583
5th         0.999333  0.842365    0.999369  0.827586    0.999982  0.798030    0.999802  0.798030
average     0.999367  0.917554    0.999483  0.893205    0.999913  0.846680    0.999617  0.842648



Result 

In order to evaluate our method, we compared the proposed CRF methods with and without MI with the EM method by Deng et al. [10] and the association method proposed by Sprinzak and Margalit [13]. The association method and the APM method [12] estimate the probabilities $\lambda_{mn}$ that domains $D_m$ and $D_n$ interact as

$$\lambda_{mn} = \frac{I_{mn}}{N_{mn}}$$

and

$$\lambda_{mn} = \frac{\sum_{P_{ij} \,:\, D_{mn} \in P_{ij}} \left( 1 - (1 - \rho_{ij})^{1/|P_{ij}|} \right)}{N_{mn}},$$

respectively, where $N_{mn}$ ($I_{mn}$) denotes the number of (interacting) protein pairs that include domain pair $(D_m, D_n)$, $|P_{ij}|$ denotes the number of domain pairs included in protein pair $(P_i, P_j)$, and $\rho_{ij}$ denotes the interaction strength of protein pair $(P_i, P_j)$, $0 \le \rho_{ij} \le 1$. However, our input interaction data are binary, that is, $\rho_{ij}$ takes only 0 or 1. Then, the numerator of the APM method becomes $I_{mn}$. It means
that the APM method for binary interaction data is 
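equivalent to the association method. In the EM method, the probabilities $\lambda_{mn}$ that domains $D_m$ and $D_n$ interact are estimated by the recursive formula,

$$\lambda_{mn}^{(t)} = \frac{\lambda_{mn}^{(t-1)}}{N_{mn}} \sum_{P_{ij} \,:\, D_{mn} \in P_{ij}} \frac{(1 - fn)^{o_{ij}} \, fn^{1 - o_{ij}}}{\Pr(O_{ij} = o_{ij} \mid \lambda^{(t-1)})},$$

where $o_{ij} = 1$ denotes that it was observed that proteins $P_i$ and $P_j$ interact with each other, and $fn = 0.8$. In this paper, the solution of the association method was given as the initial value $\lambda^{(0)}$ of the EM method.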
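The association estimate $\lambda_{mn} = I_{mn} / N_{mn}$, which also serves as the initial value of the EM method, can be sketched as follows. The function name and input layout are hypothetical, and domain pairs are assumed to be unordered and counted once per protein pair even if a domain occurs more than once in a protein.

```python
from collections import defaultdict
from itertools import product


def association_scores(protein_domains, labeled_pairs):
    """Return lambda_mn = I_mn / N_mn for every domain pair seen in the data.

    protein_domains: dict mapping a protein to an iterable of its domains.
    labeled_pairs: iterable of ((protein_i, protein_j), label), label in {0, 1}.
    """
    n_mn = defaultdict(int)  # protein pairs that include the domain pair
    i_mn = defaultdict(int)  # interacting protein pairs that include it
    for (p_i, p_j), label in labeled_pairs:
        domain_pairs = {frozenset((d_m, d_n))
                        for d_m, d_n in product(protein_domains[p_i],
                                                protein_domains[p_j])}
        for dp in domain_pairs:
            n_mn[dp] += 1
            i_mn[dp] += label
    return {dp: i_mn[dp] / n_mn[dp] for dp in n_mn}
```

A domain pair that appears in one interacting and one non-interacting protein pair receives a score of 0.5, which illustrates that the estimate depends only on these two counts.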



We performed five-fold cross-validation, that is, we split the data into 5 datasets (4 for training and 1 for test), estimated $\lambda_t^{(mn)}$ from the training datasets, and calculated $\Pr(P_{ij} = 1 \mid M)$ of equation (10) for each protein pair in the test dataset and the AUC (Area Under ROC Curve) score, where among the test dataset only protein pairs that included at least one parameter estimated from the corresponding training dataset were used. We repeated this 5 times, and took the average. Tables 1, 2, and 3 show the results on AUC for training and test datasets by the CRF method with MI, that without MI, the EM method, and the association method for H. sapiens, D. melanogaster, and C. elegans, respectively. An AUC score is the area under an ROC (Receiver Operating Characteristic) curve, and takes a value between 0 and 1. The ROC curve of a random classifier lies on the diagonal line, and its AUC score is 0.5. The ROC curve of a perfect classifier goes through the point (0 (false positive rate), 1 (true positive rate)), and its AUC score is 1. A classifier with an AUC score closer to 1 has better performance. We can see from these tables that, on average over the test datasets, the results by the CRF method with MI are better than those by the CRF method without MI, and that the results by the CRF method without MI are better than those by the EM method and the association method. It is also seen that the results by the EM method are almost the same as those by the association method. It might be because the parameters of the EM method were estimated from the solution of the



Table 2 The AUC results for training and test datasets of D. melanogaster by the CRF method with MI, that without MI, the EM method, and the association method

iteration   CRF with MI           CRF without MI        EM                    Assoc
            training  test        training  test        training  test        training  test
1st         0.999255  0.707692    0.999977  0.738462    0.999961  0.769231    0.999938  0.769231
2nd         0.997928  0.818182    0.997905  0.848485    0.999938  0.727273    0.999736  0.727273
3rd         0.997920  0.708333    0.997920  0.562500    0.999922  0.645833    0.999884  0.625000
4th         0.998660  0.863636    0.999318  0.886364    0.999814  0.840909    0.999853  0.840909
5th         0.999234  0.819444    0.999954  0.833333    0.999861  0.527778    0.999923  0.527778
average     0.998599  0.783458    0.999015  0.773829    0.999899  0.702205    0.999867  0.698038

Table 3 The AUC results for training and test datasets of C. elegans by the CRF method with MI, that without MI, the EM method, and the association method

iteration   CRF with MI           CRF without MI        EM                    Assoc
            training  test        training  test        training  test        training  test
1st         0.999975  0.657143    0.999975  0.514286    1.000000  0.542857    1.000000  0.542857
2nd         0.997899  0.923077    0.996873  0.948718    0.999875  0.743590    0.999825  0.743590
3rd         0.998775  0.900000    0.998825  0.933333    0.999875  0.866667    0.999825  0.866667
4th         0.998950  0.966667    0.999850  0.966667    0.999850  0.633333    0.999850  0.633333
5th         0.998900  1.000000    0.998875  1.000000    0.999675  1.000000    0.999700  1.000000
average     0.998900  0.889377    0.998879  0.872601    0.999855  0.757289    0.999840  0.757289



association method and the solution of the EM method 
already reached a local optimum. Figures 4, 5, and 6 
show the average ROC curves for training and test data- 
sets by the CRF method with MI, that without MI, the 
EM method, and the association method. For training 
datasets, the results by all of the methods were almost 
perfect. For test datasets, the CRF method with MI out- 
performed that without MI, the EM method, and the 
association method. It should be noted that the ROC 
curves of the EM method are almost the same as those 
of the association method for the same reason discussed 
above. 
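The AUC score used throughout this section can be computed without drawing the ROC curve, since it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following sketch uses this pairwise formulation, counting ties as one half; the function name is an illustrative assumption, and the quadratic loop is chosen for clarity rather than speed.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For labels `[1, 0, 1, 0]` with scores `[0.9, 0.8, 0.3, 0.1]`, three of the four positive-negative pairs are ordered correctly, so the score is 0.75. A classifier that assigns the same score to every pair obtains 0.5, in line with the random-classifier baseline.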

Conclusions 

We proposed novel methods which combine conditional random fields with the domain-based model of protein-protein interactions. In order to give better performance, we introduced mutual information to the probabilistic model. In the improved model, mutual information between domains is given as conditions, where MI between domains is defined as the maximum of MIs between residues in the domains. This method was developed based on the observation that amino acid residues at important sites for interactions have coevolved with each other, and that MI has been used for identifying contact residues in interactions. We performed five-fold cross-validation experiments, and calculated AUC for the probabilities that two proteins interact. The results suggested that our proposed methods, especially the CRF method with mutual information, are useful. However, the nearly perfect AUC results for the training datasets implied that the estimated parameters were overfitted to the training datasets.

Figure 4 Average ROC curves for test datasets of H. sapiens by the CRF method with MI, that without MI, the EM method, and the association method

Figure 5 Average ROC curves for test datasets of D. melanogaster by the CRF method with MI, that without MI, the EM method, and the association method

Figure 6 Average ROC curves for test datasets of C. elegans by the CRF method with MI, that without MI, the EM method, and the association method



Hayashida et al. BMC Systems Biology 2011, 5(Suppl 1):S8 
http://www.biomedcentral.com/1752-0509/5/S1/S8 






To avoid that problem, we can improve the methods, for instance, by adding regularization terms, such as a norm of the parameters, to the log-likelihood function. Since CRFs have the advantage of being able to incorporate a large number of features, improving the model itself to obtain better accuracy, for instance by modifying the local feature and adding new features, remains future work.